Skip to content

Conversation

@chris-maes
Copy link
Contributor

@chris-maes chris-maes commented Jul 8, 2025

This PR contains multiple performance improvements to the dual simplex code including:

  • O(N^2) -> O(N) bound-flipping ratio test
  • Exploiting hypersparsity (sparse factors plus sparse right-hand side) throughout the solver,
  • New middle-product form basis update (replaces Forrest-Tomlin since it currently handles hypersparsity better)
  • Maintaining and update a list of primal infeasibilites

New or existing test cover these changes. No changes in documentation are necessary.

Other experiments included in this PR:

1) Bound strengthing on CPU for dual simplex. Note that this
did not lead to an improvement on the NETLIB LP test set.

2) Attempt at O(N) bound-flipping ratio test using bucket sort.
This is stil a work in progress.

3) Attempt to compute Farkas certificate when no entering variable
found in dual simplex ratio test. This has not been verified yet.

Note that the maros NETLIB problem is classifed as infeasible
with the list of primal infeasibilites and the updated pricing.
As far as I can tell this is due to different choices for leaving
variables in the pricing---that have the same score.

To try to handle no entering variables better, I tried to
remove the dual perturbation and reset the steepest edge.
This allows me to continue on maros. But ultimately the problem
is still classified as infeasible.
@copy-pr-bot
Copy link

copy-pr-bot bot commented Jul 8, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@chris-maes chris-maes added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Jul 8, 2025
@chris-maes chris-maes added this to the 25.08 milestone Jul 8, 2025
@chris-maes chris-maes self-assigned this Jul 8, 2025
@chris-maes chris-maes marked this pull request as ready for review July 23, 2025 22:07
@chris-maes chris-maes requested review from a team as code owners July 23, 2025 22:07
@chris-maes
Copy link
Contributor Author

/ok to test efb768c

@chris-maes
Copy link
Contributor Author

/ok to test 3905665

Copy link
Member

@KyleFromNVIDIA KyleFromNVIDIA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved trivial CMake changes

@chris-maes
Copy link
Contributor Author

/ok to test 2230fbc

Copy link
Contributor

@aliceb-nv aliceb-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the awesome work Chris!
Just a few minor comments :)


phase2::reset_basis_mark(basic_list, nonbasic_list, basic_mark, nonbasic_mark);

std::vector<bool> bounded_variables(n, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::vector<bool> tends to be somewhat slower (as it's implemented through a bitmap), I'm not sure if this is anything close to a hotspot but for these use cases a std::vector or std::vector<uint8_t> work better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes. Good catch. The bitmap might slow down the access. I switched to uint8_t as suggested.

Comment on lines +72 to +75
assert(row_permutation_.size() == m);
assert(rhs.n == m);
assert(solution.n == m);
assert(Lsol.n == m);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those asserts will only be compiled when building in debug mode which we rarely ever do, we might want to use cuopt_assert instead (which is controlled by the "-a" flag in ./build.sh)

std::vector<i_t>& infeasibility_indices,
f_t& primal_inf)
{
const f_t now_feasible = std::numeric_limits<f_t>::denorm_min();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are denormals useful here? I think double should have enough precision, computations involving denormals usually imply a performance hit
As an aside: we might want to measure the performance/quality impact of flushing denormals to zero at some point for the CPU code, just in case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not using the denormals for computation here. I was using this small value as a marker in the squared_infeasibilities array. If squared_infeasibilites[j] == now_feasible, than variable j has become feasible, and it can be removed from the infeasibility_indicies list. I think it is fine to just set the value to 0.0.

void clean_up_infeasibilities(std::vector<f_t>& squared_infeasibilities,
std::vector<i_t>& infeasibility_indices)
{
const f_t now_feasible = std::numeric_limits<f_t>::denorm_min();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed denormals. See above.

std::vector<f_t> my_delta_y;
delta_y_sparse.to_dense(my_delta_y);

// TODO(CMM): Do I use the perturbed or unperturbed objective?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this TODO been addressed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, no. I'm still not sure which to use here. Right now the objective value is just for printing in the logs. So it doesn't affect the solution.

Comment on lines +2743 to +2744
// TODO(CMM): Do I also need to update the objective due to the bound flips?
// TODO(CMM): I'm using the unperturbed objective here, should this be the perturbed objective?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this TODO been addressed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, no. Similar to the above I'm not sure what to do here. Luckily, it only affects the objective in the logs.

Comment on lines 186 to 188
for (i_t k = 0; k < i.size() - 1; ++k) {
if (i[k] > i[k + 1]) { printf("Sort error %d %d\n", i[k], i[k + 1]); }
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just another tiny FYI: std::is_sorted is useful for this (in asserts and similar)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I switched to std::is_sorted.

@chris-maes
Copy link
Contributor Author

/ok to test 0c4459e

Copy link
Contributor

@aliceb-nv aliceb-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved! Thanks a lot for the great work Chris!

@tmckayus
Copy link
Contributor

This is not critical but is highly, highly desirable from a perception basis

@chris-maes
Copy link
Contributor Author

/ok to test a939cf1

@chris-maes
Copy link
Contributor Author

/ok to test 458e1dd

@chris-maes
Copy link
Contributor Author

/ok to test 9dc4329

@chris-maes
Copy link
Contributor Author

/ok to test 7144d51

@chris-maes
Copy link
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit b965d80 into NVIDIA:branch-25.08 Jul 31, 2025
73 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants